Thai Text Coherence Structuring with Coordinating and Subordinating Relations for Text Summarization
Abstract
Text summarization that takes coherence into account can be achieved through discourse processing based on Rhetorical Structure Theory (RST). However, relational ambiguity can arise, especially in Thai. For example, the cue word “tae/แต่” (meaning “but”) can signal either a contrast relation or an elaboration relation. We therefore propose reducing the level of ambiguity by reducing the relation types to two: Coordinating relations (COR) and Subordinating relations (SUBR). Our framework concentrates on coherence structuring, which requires three steps: (1) identify an attachment point for an incoming discourse unit using our Adaptive Right-frontier algorithm; (2) extract Coordinating and Subordinating relations by identifying linguistic coherence features at the lexical and phrasal levels, using Bayesian techniques; (3) construct coherence tree structures. The accuracy is 70.45% for the first step, 77.47% and 79.89% for COR and SUBR extraction, respectively, in the second step, and 64.94% for constructing the coherence tree in the third step.
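To make the first step more concrete, the following is a minimal sketch of the generic right-frontier idea from discourse theory, not the authors' Adaptive Right-frontier algorithm, whose details the abstract does not give. It assumes a simplified tree model in which a Subordinating attachment makes the new unit a child of the attachment point, while a Coordinating attachment makes it a sibling and thereby closes the attachment point to later units. The names `Node`, `right_frontier`, `attach`, and the virtual `ROOT` node are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

COR = "COR"    # Coordinating relation
SUBR = "SUBR"  # Subordinating relation


@dataclass(eq=False)  # identity-based equality avoids recursive comparisons
class Node:
    label: str
    relation: Optional[str] = None  # relation used when this unit was attached
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)


def right_frontier(root: Node) -> List[Node]:
    """Return the open attachment points: the root and every last child below it."""
    frontier = [root]
    node = root
    while node.children:
        node = node.children[-1]
        frontier.append(node)
    return frontier


def attach(root: Node, target: Node, label: str, relation: str) -> Node:
    """Attach a new discourse unit to `target`, which must lie on the right frontier.

    Assumption of this sketch: SUBR adds the unit below the target,
    COR adds it beside the target (as a child of the target's parent).
    """
    if target not in right_frontier(root):
        raise ValueError(f"{target.label} is not on the right frontier")
    if relation == SUBR:
        host = target
    elif relation == COR:
        if target.parent is None:
            raise ValueError("cannot coordinate with the virtual root")
        host = target.parent
    else:
        raise ValueError(f"unknown relation: {relation}")
    new = Node(label, relation, parent=host)
    host.children.append(new)
    return new


# Hypothetical discourse units u1..u3, attached one after another.
root = Node("ROOT")
u1 = attach(root, root, "u1", SUBR)
u2 = attach(root, u1, "u2", SUBR)  # u2 is subordinate to u1
u3 = attach(root, u2, "u3", COR)   # u3 coordinates with u2, closing u2
print([n.label for n in right_frontier(root)])  # ['ROOT', 'u1', 'u3']
```

In this sketch, an incoming unit can only be attached to `ROOT`, `u1`, or `u3` after the third step; `u2` is no longer reachable because a Coordinating relation has closed it. How the paper's adaptive variant ranks or extends these candidate points is not stated in the abstract.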
Similar resources
An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches
Text coherence evaluation has become a vital and appealing task in Natural Language Processing subfields such as text summarization, question answering, text generation and machine translation. Existing methods such as entity-based and graph-based models track how nouns and noun phrases change roles across consecutive sentences within a short part of a text. They also have limitations in global coheren...
Dual function of first position nominal groups in research article titles: Describing methods and structuring summary
Previous research has identified the nominal group as the most distinctive feature of the research article title. In contrast, the findings reported in this paper suggest Theme/Rheme is the dominant structure in title text. Theme/Rheme structures order and tie nominal groups in titles. When a title starts with a methodological term the first position nominal group acts as a theme marker. Thus, ...
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. Sifting through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
Systematic literature review of fuzzy logic based text summarization
‘Information Overload’ is not a new term, but the massive development in technology, which enables anytime, anywhere, easy and unlimited access to, participation in and publishing of information, has escalated its impact. Assisting users’ informational searches with reduced reading and surfing time by extracting and evaluating accurate, authentic and relevant information are the primary c...
Text Summarization Using Cuckoo Search Optimization Algorithm
Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has attracted considerable attention from researchers. Extractive text summarization is an important summarization method that consists of selecting the top representative sentences from the input document. When we face documents with large data volumes, the extr...